Malaria-driven adaptation of MHC class I in wild bonobo populations

The malaria parasite Plasmodium falciparum causes substantial human mortality, primarily in equatorial Africa. Enriched in affected African populations, the B*53 variant of HLA-B, a cell surface protein that presents peptide antigens to cytotoxic lymphocytes, confers protection against severe malaria. Gorilla, chimpanzee, and bonobo are humans’ closest living relatives. These African apes have HLA-B orthologs and are infected by parasites in the same subgenus (Laverania) as P. falciparum, but the consequences of these infections are unclear. Laverania parasites infect bonobos (Pan paniscus) at only one (TL2) of many sites sampled across their range. TL2 spans the Lomami River and has genetically divergent subpopulations of bonobos on each side. Papa-B, the bonobo ortholog of HLA-B, includes variants having a B*53-like (B07) peptide-binding supertype profile. Here we show that B07 Papa-B occur at high frequency in TL2 bonobos and that malaria appears to have independently selected for different B07 alleles in the two subpopulations.

IK2876 x
a LI5125 and TS5137 are identical to, but 19 bp longer than, BX4799 (asterisks) and TL3798 (double crosses), respectively.
Table 8) because it is only characterized for exon 3, and, therefore, its KIR epitope and supertype are unknown.
d An "x" indicates alleles that were detected in wild populations (N=31) (Table 2).

Supplementary Table 8. Odds ratio statistics testing the association between Papa-B KIR epitopes and peptide-binding
Supplementary Fig. 1 The Lomami River represents a barrier to bonobo gene flow. A maximum likelihood tree of previously reported and newly generated bonobo mitochondrial haplotypes with known sampling location is shown (haplotypes are listed in Supplementary Table 7). The tree was constructed in MEGA X 8 using the Maximum Likelihood method with 10,000 bootstrap replications and the Hasegawa-Kishino-Yano (HKY) model 9, with a discrete Gamma (+G) distribution used to model evolutionary rate differences among sites. Bootstrap values are shown for branches with ≥80% support. Each haplotype is named by the two-letter site code for the population in which it was initially identified and an identification number, followed by the GenBank accession number. Haplotypes from bonobos east of the Lomami River (blue) form two clades. New haplotypes from bonobos sampled in TS, LY, and LI are bolded and boxed; they cluster within Eastern clade 1 or Eastern clade 2, consistent with the Lomami River acting as a barrier to gene flow. The four haplotypes marked by arrows are those found in TL2-W. LI5125 and TS5137 are each identical to a previously identified sequence (BX4799 and TL3798, respectively), but are 19 bp longer (due to differences in the sequencing method).

Fig. 2). Bonobo and chimpanzee MHC-B allotypes are summarized by supertype in Supplementary Fig. 3. Shown for each allotype are the amino acid residues that MHCcluster predicts are enriched, in order of most-to-least enriched, at anchor positions 2 (P2) and 9 (P9) of a nonamer peptide. MHCcluster prediction accuracy values < 0.7 are in red and values of 1.0 are in bold. For each allotype grouping, the most preferred P2 and P9 residues are given. When more than one residue is strongly preferred across different allotypes, the residues are listed and separated by a slash; for some allotypes, the second most preferred residue, which distinguishes between groupings, is given in parentheses. Preferred residues given in blue font are experimentally determined motifs 11, and an asterisk denotes that the observed binding motif differs from the predicted motif 12. The B and F pockets of an MHC-B molecule bind the P2 and P9 residues, respectively. The MHC-B positions that form each pocket are listed, and positions that are monomorphic within Papa-B and Patr-B are coloured grey. The bonobo Papa-B consensus amino acid for each position is given at the top (bold), with dashes denoting identity to the consensus residue. Allotypes that are assigned to the same supertype group are similarly coloured. Allotypes that clustered within a major peptide-binding "Group" based on MHCcluster 10 results (Supplementary Fig. 2) but were more appropriately assigned to a different group are coloured accordingly. Two chimpanzee Patr-B allotypes with the Patr-B*08:01 binding profile have similarities with HLA-B*15:01 (B62 binding type) in the B pocket and P2 binding profile, as indicated by the shared grey shading. For most of the allotypes that were also assessed by de Groot et al. 12, our assignments agreed. The exceptions are Patr-B*17:01 and 17:02 (darker brown, Patr-B*17:01 supertype), which de Groot et al. 12 assigned to the supertype group designated here as Patr-B*17:03 (lighter brown). We considered those Patr-B allotypes part of a distinct group because of their unique B pocket and distinctive P2 residue preference.

Individual identification
Microsatellite sequences were PCR amplified as described 13. [...] were each diluted to a 4 nM DNA concentration, and then the two libraries were pooled in equal volume to achieve a final sequencing library concentration of 20 pM. The library was sequenced on a MiSeq using 375 bp forward and 51 reverse cycles 13. The resulting genotypes were determined using the CHIIMP microsatellite allele calling software 13. Mitochondrial D-loop amplicons were amplified and sequenced as described for the other loci 13, but in single rather than triplicate reactions and using 2 x 250 bp paired-end reads.

Papa-B exon 2 and exon 3 genotyping
The methods for PCR and Sanger sequencing of exons 2 and 3 of the Papa-B gene 4 for samples from TL2-W, TL2-E, and BX have been described previously 4,14. Briefly, each exon was separately amplified using primers that anneal to the flanking intronic regions (Supplementary Table 10). [...] After a second PCR clean-up, the DNA concentration of each sample was quantified on a Qubit fluorometer using a dsDNA assay (ThermoFisher). Each sample was then normalized to 4 nM and pooled by equal volume per sample into a single library for sequencing. The final library consisted of equal volumes of exon 2 and exon 3 amplicons (given their similar product size). To increase library complexity and sequence quality, the combined Papa-B exon 2 and 3 library was then pooled in an 80:20 ratio by volume with NGS libraries of different but similar-sized amplicons that were not part of this study, with an additional 20% spike-in of PhiX control library (Illumina).
Sequencing was performed using 2 x 300 bp paired-end reads.
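
The normalization to 4 nM described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure. It assumes the commonly used average mass of 660 g/mol per base pair of double-stranded DNA, and the function names, the amplicon length, the Qubit reading, and the final volume are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: names and example values are assumptions,
# not taken from the study.

AVG_DA_PER_BP = 660.0  # assumed average molar mass of one dsDNA base pair (g/mol)


def ng_per_ul_to_nm(conc_ng_per_ul, amplicon_bp, da_per_bp=AVG_DA_PER_BP):
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    if amplicon_bp <= 0:
        raise ValueError("amplicon length must be positive")
    # ng/uL -> g/L is a factor of 1e-3; mol/L -> nmol/L is a factor of 1e9,
    # so the combined factor is 1e6.
    return conc_ng_per_ul * 1e6 / (da_per_bp * amplicon_bp)


def dilution_volumes(stock_nm, target_nm=4.0, final_ul=20.0):
    """Return (sample_ul, diluent_ul) that give target_nm in final_ul (C1*V1 = C2*V2)."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    sample_ul = target_nm * final_ul / stock_nm
    return sample_ul, final_ul - sample_ul


if __name__ == "__main__":
    # Hypothetical Qubit reading for a hypothetical 400 bp amplicon.
    stock = ng_per_ul_to_nm(2.64, 400)
    sample_ul, diluent_ul = dilution_volumes(stock, target_nm=4.0, final_ul=20.0)
    print(f"stock: {stock:.2f} nM; mix {sample_ul:.2f} uL sample "
          f"with {diluent_ul:.2f} uL diluent")
```

Once every sample is at the same molarity, pooling equal volumes contributes an approximately equal number of molecules per sample, which is the rationale for normalizing before pooling.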

Plasmodium screening of faecal samples
The 20 samples collected from TS, LY, and LI were screened for Plasmodium sequences as